A Ranking Approach to Stress Prediction for Letter-to-Phoneme Conversion

نویسندگان

  • Qing Dou
  • Shane Bergsma
  • Sittichai Jiampojamarn
  • Grzegorz Kondrak
چکیده

Correct stress placement is important in text-to-speech systems, in terms of both the overall accuracy and the naturalness of pronunciation. In this paper, we formulate stress assignment as a sequence prediction problem. We represent words as sequences of substrings, and use the substrings as features in a Support Vector Machine (SVM) ranker, which is trained to rank possible stress patterns. The ranking approach facilitates inclusion of arbitrary features over both the input sequence and output stress pattern. Our system advances the current state-of-the-art, predicting primary stress in English, German, and Dutch with up to 98% word accuracy on phonemes, and 96% on letters. The system is also highly accurate in predicting secondary stress. Finally, when applied in tandem with an L2P system, it substantially reduces the word error rate when predicting both phonemes and stress.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Multi-Strategy Approach to Improving Pronunciation by Analogy

Pronunciation by analogy (PbA) is a data-driven method for relating letters to sound, with potential application to next-generation text-to-speech systems. This paper extends previous work on PbA in several directions. First, we have included "full" pattern matching between input letter string and dictionary entries, as well as including lexical stress in letter-to-phoneme conversion. Second, w...

متن کامل

Letter-to-Phoneme Conversion for a German Text-to-Speech System

This thesis deals with the conversion from letters to phonemes, syllabification and word stress assignment for a German text-to-speech system. In the first part of the thesis (chapter 5), several alternative approaches for morphological segmentation are analysed and the benefit of such a morphological preprocessing component is evaluated with respect to the grapheme-to-phoneme conversion algori...

متن کامل

An evaluation of non-standard features for grapheme-to-phoneme conversion

Machine learning methods for grapheme-to-phoneme (G2P) conversion are popular, but the features used in the literature are most often simply a window of context letters, despite the availability of other features. In this paper, a set of features beyond the seven-letter window, termed non-standard features, are systematically evaluated for American English, using decision trees. The results sho...

متن کامل

Applying Many-to-Many Alignments and Hidden Markov Models to Letter-to-Phoneme Conversion

Letter-to-phoneme conversion generally requires aligned training data of letters and phonemes. Typically, the alignments are limited to one-to-one alignments. We present a novel technique of training with many-to-many alignments. A letter chunking bigram prediction manages double letters and double phonemes automatically as opposed to preprocessing with fixed lists. We also apply an HMM method ...

متن کامل

Joint Processing and Discriminative Training for Letter-to-Phoneme Conversion

We present a discriminative structureprediction model for the letter-to-phoneme task, a crucial step in text-to-speech processing. Our method encompasses three tasks that have been previously handled separately: input segmentation, phoneme prediction, and sequence modeling. The key idea is online discriminative training, which updates parameters according to a comparison of the current system o...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009